RDFPRO: an extensible tool for building stream-oriented RDF processing pipelines

Authors

  • Francesco Corcoglioniti
  • Marco Rospocher
  • Marco Amadori
  • Michele Mostarda
Abstract

We present RDFPRO (RDF Processor), an open source Java command line tool and embeddable library that offers a suite of stream-oriented, highly optimized processors for common tasks such as data filtering, RDFS inference, smushing and statistics extraction. RDFPRO processors are extensible by users and can be freely composed to form complex pipelines that efficiently process RDF data in one or more passes. We show how RDFPRO's model and multi-threaded design allow processing billions of triples in a few hours in a typical Linked Open Data integration scenario, and discuss relevant implementation aspects and lessons learnt.
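To make the idea of freely composable, stream-oriented processors concrete, here is a minimal sketch in Python. It is not RDFPRO's API (RDFPRO is a Java tool, and nothing here is taken from it). The names `keep_predicate`, `rewrite_terms` and `compose` are hypothetical, triples are plain string tuples, and the "smushing" step is reduced to a dictionary lookup of aliases. The sketch only shows how processors that each consume and produce a stream of triples can be chained without materializing the dataset.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

# A triple is modelled as a (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]
# A processor (hypothetical type) turns one stream of triples into another.
Processor = Callable[[Iterable[Triple]], Iterator[Triple]]


def keep_predicate(predicate: str) -> Processor:
    """Filtering processor: pass through only triples with the given predicate."""
    def run(triples: Iterable[Triple]) -> Iterator[Triple]:
        return (t for t in triples if t[1] == predicate)
    return run


def rewrite_terms(aliases: Dict[str, str]) -> Processor:
    """Simplified smushing-like processor: replace aliases with a canonical term."""
    def run(triples: Iterable[Triple]) -> Iterator[Triple]:
        for s, p, o in triples:
            yield (aliases.get(s, s), p, aliases.get(o, o))
    return run


def compose(*processors: Processor) -> Processor:
    """Chain processors so that each one consumes the previous one's output lazily."""
    def run(triples: Iterable[Triple]) -> Iterator[Triple]:
        stream: Iterator[Triple] = iter(triples)
        for processor in processors:
            stream = processor(stream)
        return stream
    return run


if __name__ == "__main__":
    data = [
        ("ex:a", "rdfs:label", "A"),
        ("ex:a2", "rdfs:label", "A again"),
        ("ex:a", "ex:knows", "ex:b"),
    ]
    pipeline = compose(
        keep_predicate("rdfs:label"),
        rewrite_terms({"ex:a2": "ex:a"}),
    )
    for triple in pipeline(data):
        print(triple)
```

Because every stage is a generator, triples flow through the whole chain one at a time, which is the property that lets a streaming design handle inputs larger than memory. Tasks that need several passes or sorting, as mentioned in the abstract, would need more machinery than this sketch provides.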

Similar resources

Demonstrating the Power of Streaming and Sorting for Non-distributed RDF Processing: RDFpro

We demonstrate RDFpro (RDF Processor), an extensible, general-purpose, open source tool for processing large RDF datasets on a commodity machine leveraging streaming and sorting techniques. RDFpro provides out-of-the-box implementations – called processors – of common tasks such as data filtering, rule-based inference, smushing, and statistics extraction, as well as easy ways to add new processor...

Change-Resilient Design and Dataflow Optimization for Distributed XML Stream Processors

We propose a new stream-processing framework based on a virtual assembly line (val) model. We instantiate the val framework obtaining ∆-XML, an approach for designing and optimizing distributed XML processing pipelines. val/∆-XML greatly simplifies the design of change-resilient dataflow pipelines: XML processors (called actors) can be inserted, deleted, and their “scope of work” (the parts of ...

Implementation and Experiments of an Extensible Parallel Processing System Supporting User Defined Database Operations

This paper presents an implementation method and experimental results of an extensible parallel processing system for databases. We have already proposed a stream-oriented parallel processing scheme (stream-oriented scheme) of basic operations for databases and knowledge bases. This scheme is based on the demand-driven evaluation incorporating stream processing. We have designed basic primitive...

Integrating Xml and Rdf Concepts to Achieve Automation within a Tactical Knowledge Management Environment

Since the advent of Naval Warfare, Tactical Knowledge Management (KM) has been critical to the success of the On Scene Commander. Today’s Tactical Knowledge Manager typically operates in a high stressed environment with a multitude of knowledge sources including detailed sensor deployment plans, rules of engagement contingencies, and weapon delivery assignments. However the WarFighter has place...

Study on Baseflow Separation of "Abolabas River" Using ADUKIH and RDF Methods

Objective: Currently, the evaluation of baseflow components has been of worldwide concern due to the influential role of streamflow (base flow and direct flow) in agriculture, water resources management, and the supply of potable water. Direct field measurement of baseflow is not practicable, especially in large areas with statistics deficiencies. Also, this would not be economically e...

Publication date: 2014